Data Analysis and Visualization for Yelp

Motivation

America is a country of diverse cultures. People from all over the world come here to work, study, and live. Of the many aspects of that diversity, food is one of the most interesting to explore because it is so deeply woven into daily life. On a large scale, we are interested in the spatial distribution of the number of restaurants across states. At the same time, understanding the distributions of restaurants' prices, ratings, review counts, and kinds gives us a rough idea of our own food choices. As we explored further, we expected to find connections between restaurants based on their kinds. These questions led us to Yelp, a popular website that lists restaurant information. Our data comes from Yelp's API and its website, and both our analysis and visualizations are presented in the following sections.

Spatial Distribution of Restaurants in the USA

The map of restaurant counts across the USA shows large spatial differences. Restaurants are clearly concentrated in California and New York.
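The per-state counts behind such a map could be computed along the following lines. This is only a minimal sketch: the record layout (a `state` field holding a two-letter code) and the sample records are assumptions made for illustration, not the project's actual data.

```python
from collections import Counter

# Hypothetical sample records; the real data set is not shown in this document.
restaurants = [
    {"name": "A", "state": "CA"},
    {"name": "B", "state": "CA"},
    {"name": "C", "state": "NY"},
    {"name": "D", "state": "TX"},
]


def count_by_state(records):
    """Return a Counter mapping state code -> number of restaurants.

    Records with a missing or empty "state" field are skipped.
    """
    return Counter(r["state"] for r in records if r.get("state"))


state_counts = count_by_state(restaurants)

# Print states from most to fewest restaurants.
for state, n in state_counts.most_common():
    print(state, n)
```

The resulting counts can then be passed to any mapping tool to shade each state by its number of restaurants.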


Data Collection and Cleaning

  1. Yelp API won't
  2. Web scraping over 20,000 pages
  3. Chinese
  4. Alcohol
  5. Japanese/Korean
  6. American
  7. South American (Mexican)
  8. Southeast Asian
  9. Indian
  10. European
  11. Dessert
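One way the cleaning step might assign scraped labels to the broader kinds listed above is a simple lookup table. The sketch below is an assumption, not the project's actual procedure: the raw labels, the `GROUP_OF` table, and the `group_categories` helper are all hypothetical.

```python
# Hypothetical mapping from raw category labels to the broader kinds
# listed above; the labels on the left are illustrative only.
GROUP_OF = {
    "szechuan": "Chinese",
    "dim sum": "Chinese",
    "bars": "Alcohol",
    "sushi bars": "Japanese/Korean",
    "korean": "Japanese/Korean",
    "burgers": "American",
    "mexican": "South American (Mexican)",
    "thai": "Southeast Asian",
    "indian": "Indian",
    "italian": "European",
    "desserts": "Dessert",
}


def group_categories(raw_labels):
    """Map raw labels to a sorted list of unique kinds.

    Labels are normalized (trimmed, lowercased) before lookup;
    labels missing from GROUP_OF are dropped.
    """
    normalized = (label.strip().lower() for label in raw_labels)
    return sorted({GROUP_OF[label] for label in normalized if label in GROUP_OF})


print(group_categories(["Sushi Bars", "Korean", "Bars", "Vegan"]))
```

Dropping unknown labels keeps the sketch short; a real pipeline would more likely log them so the table can be extended.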

Data Exploration

  1. Descriptive Statistics
  2. Network Analysis
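A network of restaurant kinds could be built by linking two kinds whenever the same restaurant is listed under both. The sketch below assumes that definition of an edge; the sample data and the `cooccurrence_edges` helper are hypothetical and are not taken from the project.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample: the kinds assigned to each restaurant.
restaurant_kinds = [
    ["Chinese", "Dessert"],
    ["Chinese", "Southeast Asian"],
    ["Alcohol", "American"],
    ["Chinese", "Dessert", "Alcohol"],
]


def cooccurrence_edges(kind_lists):
    """Count how often each pair of kinds appears on the same restaurant.

    Each pair is stored as an alphabetically sorted tuple, so
    ("Chinese", "Dessert") and ("Dessert", "Chinese") share one edge.
    """
    edges = Counter()
    for kinds in kind_lists:
        for a, b in combinations(sorted(set(kinds)), 2):
            edges[(a, b)] += 1
    return edges


edges = cooccurrence_edges(restaurant_kinds)

# Print edges from heaviest to lightest.
for (a, b), weight in edges.most_common():
    print(a, "--", b, weight)
```

The weighted edge list can be loaded into a graph library or visualization tool, with edge weight indicating how strongly two kinds are connected.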

Future Improvement